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Abstract 

Federated machine learning has great potential for 
coalescing isolated data islands. It allows artificial 
intelligence models to be trained on encrypted data 
fragments in a distributed manner, thus enabling in¬ 
formation to be shared without compromising data 
security or user privacy. In addition to addressing 
security and privacy concerns, the federation also 
needs to provide sufficient economic incentives for 
its participants to join and stay in the federation. 

The problem of FML incentive mechanism design 
is therefore proposed so as to find out the best orga¬ 
nization and payment structure for the federation. 

In this paper, we set up a research framework for 
reasoning about FML incentive mechanism design. 

We introduce key concepts and their mathemati¬ 
cal notations specified under the FML environment, 
hereby giving a precise definition of the FML in¬ 
centive mechanism design problem. We provide 
three basic settings to break down the big compli¬ 
cated problem into two subproblems with signifi¬ 
cantly reduced complexities. We present a taxon¬ 
omy of FML incentive mechanisms based on differ¬ 
ent settings and properties carried by these mecha¬ 
nisms. We also provide four benchmark theorems 
for FML practitioners to choose among these cate¬ 
gories according to their preferred optimization ob¬ 
jectives as well as real-life conditions without hav¬ 
ing to deeply understand game theory. 

1 Introduction 

Recently, artificial intelligence (AI) has achieved great suc¬ 
cess in many fields. But those success stories are difficult to 
duplicate in many other fields due to data fragmentation and 
regulatory strictness. In most industries, data are segregated 
into isolated data islands. In addition, direct data sharing is 
restricted by laws and regulations such as the General Data 
Protection Regulation (GDPR) [EU, 2016]. To address these 
problems. Federated Machine Learning (FML) [Yang et al, 
2019] has been proposed. 

The main idea of FML is to build machine learning models 
based on data sets that are distributed across different sites 


while preserving data privacy. FML platforms have been 
developed and widely applied, e.g., TensorFlow Federated 
(TFF) from Google and FATE from WeBank. Industries such 
as finance, insurance, telecommunications, healthcare, edu¬ 
cation, and urban computing are at the edge of adopting and 
benefitting from FML technologies on a large scale. 

In industry practice, FML needs to be economically attrac¬ 
tive to its participants so that they have incentives to join and 
stay in the federation. Unlike ordinary data unions, FML par¬ 
ticipants own not only the data, but also the submodels. As 
a result, the entire FML model breaks even if one partici¬ 
pant leaves. These considerations give rise to the problem of 
FML incentive mechanism design , which studies how much 
the federation incent each of its participants. 

Mechanism design is a classical subfield of game theory. 
But FML incentive mechanism design bears very different 
problem settings than the conventional cases. Although sev¬ 
eral studies have been carried out, most of them are based 
on unrealistic assumptions or are too abstract to be realized. 
Even worser, the FML incentive design problem has not been 
clearly defined. Being an economic problem in nature, FML 
incentive mechanism design is based on assumptions that are 
approximately reasonable but never entirely true. Progress 
can be slow if mechanisms are challenged solely for their as¬ 
sumptions. 

In view of these issues, in this paper, we provide a standard 
definition of the problem of FML incentive design, setting 
up a basic framework for solving this problem, and provid¬ 
ing a taxonomy to classify FML incentive mechanisms. We 
prove four benchmark theorems which can help FML prac¬ 
titioners to choose the most appropriate mechanism among 
many. Concepts and notations in our framework bear practi¬ 
cal meanings, thus making it easy to be engineered as mod¬ 
ules of FML platforms. 

2 Related Works 

The concept of FML was first proposed by Google [Konecny 
et al., 2016a; Konecny et al., 2016b; McMahan et al., 2016] 
and further developed by [Yang et al., 2019]. Later, We¬ 
Bank published a white paper [WeBank, 2018], setting up 
a framework of FML. Numerous works followed. However, 
the problem of FML incentive mechanism design has not re- 



ceived wide attention yet. 

Mechanism design has a long history as a subfield of game 
theory. Hurwicz first introduced the notion of mechanism de¬ 
sign in 1960 [Hurwicz, I960], In 1961, Vickrey published the 
famous paper of VCG auction mechanism [Vickrey, 1961], 
Harsanyi developed the theory of games with incomplete in¬ 
formation during 1967-68 [Harsanyi, 1967; Harsanyi, 1968a; 
Harsanyi, 1968b]. Hurwicz introduced the notation of incen¬ 
tive compatibility in 1972 [Hurwicz, 1972]. These classical 
concepts still have a foundational value to mechanism design 
until now. 

The problem of FML incentive mechanism design has a 
lot to borrow from related academic domains such as coali¬ 
tion formation [Rahwan et al., 2015; Konishi and Ray, 2003], 
profit/cost sharing games [Augustine et al., 2011; von Falken- 
hausen and Harks, 2011], and incentive mechanism design 
in other machine learning problems [Dekel et al., 2010; 
Meir et al., 2012], However, results from existing researches 
cannot be copied mechanically because of the differences in 
game settings, controllable variables, nature of economic re¬ 
sources, and level of information asymmetry. 

3 Preliminaries 

Hereafter is a brief introduction of mechanism design in the 
language of game theory. Notations and terminologies used 
in this section are consistent with those in [Narahari, 2014]. 

The environment of general mechanism design is set up as 
follows: 

• There exists a set of n agents, denoted by N = 

(1,2 ,...,n); 

• Each agent i £ N has a type 9i £ 0, which determines 
his preferences; The collection of types of all agents 
forms the type profile 6 = (9\, 9-2, ■ ■ ■, 9 n ) € © = 
0 ! x 0 2 x • • • x 0„. 

• The result that agent i receives from the the game is de¬ 
noted by yi £ V, and called the outcome for player i. 
The outcomes for all agents compose the outcome pro¬ 
file, denoted by y = (y 1} y 2 ,. ■., y„) £ Y = fixf 2 x 
• • • x Y n . 

• Agents have preferences over outcomes that are repre¬ 
sented by a utility function ufi-) : Y x 0, 4 R. 

Both the type and the outcome affect the utility. The follow¬ 
ing rule of thumb can be used to distinguish them: for any 
independent variable in the utility function, 

• it is a type parameter if it is an inherent property of the 
agent and cannot be changed by actions of any agent; 

• it is an outcome parameter if it can be decided or manip¬ 
ulated by the mechanism designer. 

The type parameter is assumed to be private information 
of agent i. The mechanism designer can request agents to 
report their types, but there is no guarantee for truthfully re¬ 
porting. The reported type of agent i is denoted by 0, £ 0,;. 
Mechanism design is to decide the outcome profile based on 
reported types from all agents. 


Definition 1 (Mechanism Design). Mechanism design is to 
optimize the social choice function y(-) : 0 i-> T , which 
is a mapping y(-) : 0 4 Y that assigns to each possible 
reported type profile 0 = ( 61 , 62 , , 9 n ), an outcome profile 

from the set Y, in order to obtain some desirable properties 
or maximize some objective functions. 

A mechanism is composed of both the type parameters and 
the social choice function. However, the domain of types 
are usually specified by game settings. Mechanism design 
is mainly to design the social choice function. 

Given a mechanism, it is up to agent i’s strategy to choose 
what 9i to report. The reported type profile is called in equi¬ 
librium if no agent has incentive to report a different type. 
There are different kinds of equilibriums, as defined below. 

Definition 2 (Equilibrium). A reported type profile 0 is in 
Equilibrium iff one of following conditions is satisfied: 

• (Nash Equilibrium, NE) 

Ui(y( 6 i, 9_i), 9i) > ufyfy, 0-i ), 6 i),for\H, 9\; (1) 

• (Dominant Strategy Equilibrium, DSE) 

Ui(y 0 i,d-i), 6 i) > Uiiyih^df^^fforWiJfeY; 

( 2 ) 

• (Bayesian Equilibrium, BE) 

K §i \ui(y(9i, d^i), 6 i)\ 6 i] 

> Eg .lufiyfy, 6 _i, ) 6 i )\ 6 i \,for\/i, 9\; (3) 

• (Perfect Bayesian Equilibrium, PBE) in a dynamic 
game, at any point of time t, the future reported type pro¬ 
file {fY 5 }, s = t ,... ,T is in Bayesian Equilibrium with 
respect to the belief at time t on {<Y S }, s = t,... ,T. 

Also, it is conventional to assume quasilinear utilities, i.e., 
the outcome y, is a tuple (xi,pf), where Xi denotes the eco¬ 
nomic resources allocated to agent i and p, denotes the money 
paid to agent i. The social choice function is thereby decom¬ 
posed into the allocation function x(-) : 0 4 I and the 
payment function p(-) : 0 1-4 R”. A quasilinear utility func¬ 
tion is linear with the payment p,. 

Definition 3 (Quasilinear Utility). A quasilinear utility func¬ 
tion has the form 

ufiy, 9i) = vfx, 9i) +pi, (4) 

where vfix, 6 f) is called the valuation function of agent i. 

With quasilinear utilities, we can write down a series of 
social metrics to measure the social goodness of a mechanism 
from different angles, as listed below. 

Definition 4 (Social Metrics). A social metric is a mapping 
w(-) : (v,p) 1-4 R which combines individual utilities of all 
agents into a single real number. From different perspectives, 
we define social welfare, social surplus, and social revenue 
as follows: 

n n 

WEL(v,p) = y^Vj(x, 6 i) + y^Pi, 

i=1 i— 1 

n n 

SUR(v,p) = ^Ui(£c, 9i), REV(v,p) = - ^ Pi. (5) 

i =1 i =1 



We also introduce the unfairness function zu(-) : ( v,p ) H>• 
R, which measures the unfairness among individuals. 

4 Specifications for FML Mechanism Design 

4.1 Economic Roles 

Agents in the FML game play different economic roles as 
listed below. In practice, one single agent may have multiple 
roles, but it helps if these roles are divided clearly. 

• (Coordinator) The coordinator is the agent who collects 
data, builds the FML model, and provides prediction ser¬ 
vices to model users. 

• (Data Owner) Data owners are the agents who own the 
data useful for FML model, denoted by i = 1,..., n. 

• (Model User) Model users are the agents who can apply 
the FML model to generate economic benefits, denoted 
by j = 1,..., m. Data owners can be model users at the 
same time. 

• (Financier) The financier is the agent who provides in¬ 
tertemporal financing services under dynamic FML set¬ 
tings. 

4.2 Type Parameters and Controllable Allocations 

Type parameters for data owner i can be specified as (qi, ), 
where the data quality q t is a measure of the effective con¬ 
tribution of the data to the quality of FML model. This term 
bears its generalized meaning so that a bigger data size also 
results in a bigger q t . 7 * is the type of data owner i’s individ¬ 
ual cost function. The higher of it indicates a higher cost for 
data owner i to contribute a certain data quality. 

The type parameter for model user j can be specified as 
which affects the individual benefit function. For a piece of 
prediction from a given FML model, a higher means that 
such prediction brings greater benefits to model user j. 

The outcome parameters need to be controlled by the FML 
coordinator. Beyond designing the transfer payment function 
p(-), the FML coordinator can also decide whether to accept 
or reject data from some data owners (or only accept a portion 
of data from some data owners), as well as decide how many 
prediction services are provided to each model user. There¬ 
fore, the allocation function for data owners can be specified 
as rj(-) : © K> [0.1]™, where q%{-) = 1 means the data owner 
is accepted and qi (•) = 0 rejected. Similarly, the allocation 
function for model users can be specified as k(-) : 0 1 —>• R n , 
where Kj (■) is the access permission provided to model user 
j to get predictions from the FML model. 

The utility function for data owner i is specified as 

«»(■) =Pi(g°»7,7) ~ Ci(qiVi,li), (6) 

where c(-) is the individual cost function. Similarly, the utility 
function for model user j is specified as 

Uj{-) =e j {Q{qor]),K j ,t j )+p j (K,€), (7) 

where ej(-) is the individual benefit function and Q(q o rf) is 
the quality of the FML model. 


The federation revenue is the sum of all payments received 
from the model users, denoted by 

m 

£(-) = -Z>j (*,€)• (8) 

3 =1 

In order to further differentiate it from the social revenue de¬ 
fined in Eq. 4. We call the social revenue the federation profit 
because it is the federation revenue minus the total payments 
payed to data owners. 

n 

PROFIT(-) = B(-) — ^^Pi(q o rj, 7 ), (9) 

i=1 

4.3 Objectives 

Below is a list of desirable properties of FML incentive mech¬ 
anism design: 

• (Incentive Compatibility, IC) IC is attained if in equi¬ 
librium, all agents report their types truthfully, i.e., 
9 = 9. Different types of equilibriums define differ¬ 
ent IC conditions, which can be one of Nash Incentive 
Compatibility (NIC), Dominant Incentive Compatibility 
(DIC ), Baysian Incentive Compatibility (BIC), ox Perfect 
Baysian Incentive Compatibility (PBIC). 

• (Individual Rationality, IR) A mechanism is individually 
rational (IR) if this mechanism does not make any player 
worse off than if he quits the federation, i.e., 

Ui(q, 7 ) > 0 for Vi, and Uj(£) > 0 forVj, (10) 

• (Budget Balance, BB) A mechanism is weakly budget 
balanced (WBB) if for all feasible outcomes, the sum of 
payments is less than or equal to zero, i.e., 

n 

X^Fi(9,7)<0, for Vq, 7 . (11) 

i—1 

It is strongly budget balanced (SBB) if the equity holds. 

• (Social Optimization) Depending on the chosen social 
metric, social optimization can be allocative efficiency 
(AE) if the social welfare function is maximized, profit 
maximization (PM) if the federation profit is maximized, 
or social surplus maximization (SSM) if the social sur¬ 
plus is maximized. 

• (Fairness) Fairness is attained if the unfairness function 
is minimized. 

• (Collusion Resistant) A mechanism is collusion resistant 
if no subgroup of agents can gain higher utilities by con¬ 
ducting fraudulent activities in collusion. 

4.4 Problem Definition 

Table 1 lists the mathematical concepts introduced so far 
and their corresponding notations. With these concepts, we 
present a formal definition for the problem of FML incentive 
mechanism design. 

Definition 5 (FML Incentive Mechanism Design). FML in¬ 
centive mechanism design is to design the optimal p(-), q(-), 
and k(-), as functions of reported data qualities q and re¬ 
ported 7 , in order to achieve a subset of objectives in Sec¬ 
tion 4.3. 



Symbol 

Meaning 

i 

Index of data owner 

j 

Index of model user 

uf) 

Utility function 

v(-) 

Valuation function 

a(-) 

Individual cost function 

e j (■) 

Individual benefit function 

qi or qi 

Data quality 

Ti or 7 i 

Type of individual cost function 

0 or ij 

Type of individual benefit function 

Pi(-),Pj(-) 

Payment to FML participant 

Vi{~) 

Portion of data accepted by federation 

«j(') 

Level of model access permission 

Q{ •) 

Quality of FML model 


Federation revenue 

Qi , 0j or 0 ,, Oj 

Type, Qi = (q it and 6j = 

2 /i(')> Vji') 

Outcome, y, = (rpyPi) and y :j = (nj,Pj) 

w(-) 

Social metric 

w(-) 

Unfairness function 


Table 1: List of Mathematical Symbols 


5 Basic Settings for FML Mechanism Design 

The FML incentive mechanism design problem is compli¬ 
cated when all relevant factors are considered simultaneously. 
Such complexities impair the transparency and understand- 
ability necessary for FML practitioners. In this section, we 
set up three basic assumptions on FML mechanism design. 
These assumptions divide the original problem into two well- 
defined subproblems with significantly reduced complexity, 
and are good approximations of reality. 

5.1 Basic Assumptions 

Assumption 1 (Quasilinear Environment & Mechanism De¬ 
sign with Money). There exists a common unit of measure 
(which is called money) of individual utilities for all agents. 
The federation coordinator can make monetary transfers to 
FML participants. 

This assumption is equivalent to assuming a quasilinear en¬ 
vironment, in which the FML coordinator can make monetary 
transfers to increase or reduce individual utilities to any de¬ 
sired values. 

Assumption 2 (Separation between Data Supply and Model 
Demand). The data supply market and the model demand 
market are separated. When an FML participant is both a 
data owner and a model user, his decision as a data owner 
does not affect his decision as a model user, or vice versa. 

Under this assumption, when we design mechanisms on 
the supply side, it is safe to treat k and £ as fixed values. So 
Iff) can be written as B(q o rj). On the demand side, the 
model quality Q is assumed fixed. 

In reality, the decisions on the supply side and the demand 
side are usually related, but Assumption 2 is still reasonable. 
It is easier for FML participants to understand this two-part 
payment structure as a reward of contributing data minus a 
fee for utilizing the model. Accordingly, they can make two 


independent simpler decisions instead of a single comprehen¬ 
sive decision. 

Assumption 3 (Existence of Exogenous Capital Market). 
FML participants can borrow or lend money freely through 
an exogenous capital market. The interest rate is an exoge¬ 
nous variable. 

This assumption helps avoid problems caused by tempo¬ 
rally mismatched federation revenues and costs. Because a 
third party financier can provide inter-temporal financing ser¬ 
vices to FML participants. We can convert some dynamic 
problems into static ones by calculating the net present value 
(NPV) of future cash flows under an exogenous interest rate. 

This assumption is also reasonable in practice. Emerging 
internet banks can act as such third-party financiers, charging 
a reasonable interest rate. 

5.2 Subproblems 

The three basic assumptions in Section 5.1 divide the original 
FML incentive mechanism design problem into two subprob¬ 
lems as follows. 

Definition 6 (Supply Side FML Incentive Mechanism De¬ 
sign). Given that the federation revenue B(q o rj) is an ex¬ 
ogenous function, the supply side FML incentive mechanism 
design is to design the optimal pfiq,^) andrffiq,^), i = 
1 as functions of reported data qualities i = 

1 ,... ,n and reported cost types i = 1 ,..., n, in order 
to achieve a subset of objectives in Section 4.3. 

Definition 7 (Demand Side FML Incentive Mechanism De¬ 
sign). Given that the model quality Q is an exogenous con¬ 
stant, the demand side FML incentive mechanism design is to 
design the optimal Pj(£) and j = 1 ,... ,m, as func¬ 

tions of reported benefit types f, j = 1 ,,m, in order to 
achieve a subset of objectives in Section 4.3. 

6 Variable Settings and Taxonomy 

Assumptions listed in Section 5.1 are generally applicable to 
the problem of FML mechanism design. Other assumptions, 
lacking such generality, are not considered standard but need 
to be specified based on real conditions. 

Also, objectives listed in Section 4.3 cannot be simultane¬ 
ously fulfilled. FML mechanism designers have to choose a 
subset to optimize for and give up on the rest. 

Different game settings and objectives naturally classify 
FML incentive mechanism into categories. Mechanisms in 
the same category are objectively comparable with respect 
to how well they achieve the chosen objectives of their cate¬ 
gory. Comparisons of mechanisms from different categories 
are more of a subjective matter. To aid such comparisons, 
it is recommended that each FML incentive mechanism be 
accompanied by a checklist of its game settings and chosen 
objectives. 

6.1 Different Game Settings 

The following additional game settings need to be specified 
for FML incentive mechanism design problems. 



• (Level of Information Asymmetry) There may be in¬ 
formation asymmetry about types of individual benefit 
functions on the demand side, or not. The supply side 
may assume a) no information asymmetry, b) informa¬ 
tion asymmetry about data qualities only, c) about types 
of individual cost functions only, or d) about both. 

• (Level of Incomplete Information) The FML coordinator 
may be uncertain about the exact form of the federation 
revenue function. This is incomplete information. Data 
owners and model users may be uncertain about their 
own types. This is called incomplete self information. 

• (Mode of System Evolution) If the game is played for 
only once, it is a static mechanism. If the game is played 
repeatedly and parameters change over time, it is a dy¬ 
namic mechanism. 

• (Belief Updates) In a dynamic problem, as time passes 
by, FML practitioners can update their beliefs. This can 
be heuristic belief updates or Bayesian belief updates, 
in which agents update their information based on some 
heuristic rules or Bayesian rules respectively. 

• (Controllable Outcomes) p(-), r/(-), and n(-) are poten¬ 
tial controllable outcomes, but some of them may not be 
controlled for some situations. On the demand side, the 
FML coordinator may choose not to exert access permis¬ 
sion control on the FML model, or not to practice price 
discrimination so that the unit price of FML prediction 
services is the same for all model users. On the demand 
side, the FML coordinator may not have the authority to 
reject data owners, therefore rj(-) is not controllable. 

6.2 Different Objectives 

Designers of FML incentive mechanisms need to optimize a 
subset of objectives from Section 4.3. Further specifications 
of these objectives are required, as explained below. 

• (Incentive Compatibility, IC) The IC condition can be 
further specified as NIC, DIC , BIC, or PBIC. 

• (Individual Rationality, IR) The FML incentive mecha¬ 
nism may or may not satisfy the IR constraint. 

• (Budget Balance, BB) The FML incentive mechanism 
can be WBB, SBB, or not budget balanced. 

• (Social Optimization) Depending on the chosen social 
metric, the mechanism can be allocative efficient (AE), 
profit maximal (PM), or social surplus maximal (SSM). 

• (Fairness) A specific form of the unfairness function 
needs to be assigned to obtain fairness. 

• (Collusion Resistance) A mechanism can be, not be, or 
partially be resistant to certain kinds of collusions. 

Mechanism designers may choose to ignore some of these 
objectives. There are also cases where some conditions are 
approximately satisfied, i.e, the equalities or inequalities ap¬ 
proximately hold within certain bounds. 

6.3 The Checklist and Categories 

Table 2 is a template checklist with information to character¬ 
ize FML incentive mechanisms. Different specifications of 


this checklist classify FML incentive mechanisms into dif¬ 
ferent categories. It is recommended that such checklists are 
provided with FML incentive mechanisms so that FML prac¬ 
titioners can choose the most appropriate mechanism from 
the right category. 

6.4 Benchmarks for Choosing among Categories 

At last, we provide several benchmarks in order to assist FML 
practitioners to compare different categories. Let w*(0), 6 £ 
© denote the maximum value of the social metric attained by 
the optimal mechanism. Since 9 is unknown ex ante, we need 
to measure Ea(@)W*( 0), where A(0) is a prior belief on the 
distribution of 0. The following theorems can be proved. 

Theorem 1 (Larger outcome space, better social optimiza¬ 
tion). The bigger is the set Y, the larger is E A (©)tu*(0). 

Proof. Suppose Y 1 D Y 2 . since any possible social choice 
function y 2 (-) : 0 i—»• Y 2 £ ty 2 is also a social choice 
function in r 3T 1 , we have T!/ 2 C y . Therefore, 

Ea (@)W 2 *{9) = E A(&)w(y 2 *(9),9) < E a(&) w u (0). 

( 12 ) 

Q.E.D. □ 

From this theorem, the authority to reject data and the con¬ 
trol of model access increase optimal values of social met¬ 
rics. For example, if the federation has the right to reject data, 
Y 1 = [0,1]” x K”; if not, Y 2 = {l} n x R n . Therefore, the 
latter case results in a greater optimal social metric. 

Theorem 2 (Less inforamtion asymmetry, better social opti- 
mizaion). If the IC constraint is required, the bigger is the set 
0 and the less accurate of the belief about 0, the smaller is 

E a(@)W*{9). 

Proof. Suppose 0 1 C 0 2 and the belief on 0 1 is no less ac¬ 
curate than that on © 2 , then for any possible social choice 
function y 2 (-) : © 2 Y, we can find a corresponding 
y 1 (-) : 0 1 i —y Y^ so that 

y 1 (0) = y 2 (9), forVfl G 0 1 C © 2 , (13) 

where 6 is an estimation of the true type 9. This defines 
a many-to-one mapping ££ from the function space <3f 2 = 
{y 2 (-), y 2 (-) satisfies IC on 0 2 } to the function space = 
{2 / 1 (')? y 1 (') satisfies IC on 0 1 }. 

We can notice that if y 2 (-) satisfies the IC condition on 0 2 , 
Jz?(y 2 (-)) naturally satisfies the corresponding IC condition 
on 0 1 , which is less strict. Therefore, T£(f3/ 2 ') C r 3C 1 

The optimization problems, written with notations above, 
become 

E A (©i)W*( 0) = max {E A{& i ) w(y l (9),0)}, l = 1 , 2 . 

y L 

(14) 

Denote the solutions of Eq.14 by y 1 *(-)> y 2 *(-)> and the so¬ 
lution when W 1 is replaced by J? (ff 2 ) as y 12 *(-). We have 

E A( ©i ) iu 2 *( 6 ') = E a( ©i )W (y 2 *(9),9) 
=E A{& i ) w(Jf’(y 2 *)(0),9)<E A{& i ) w(y 12 *(0),9) (15) 




Game Settings & Objectives 

Specifications 


Information asymmetry about individual benefit functions 

Yes/No 

Demand 

Access permission control on FLM model 

Yes/No/Partial 

Side 

Price discrimination 

Yes/No 

Settings 

Specification of individual benefit functions 

Specification 


Specification of the model quality function 

Specification 


Information asymmetry about data qualities 

Yes/No 

Supply 

Information asymmetry about individual cost functions 

Yes/No 

Side 

Incomplete information about federation revenue function 

Yes/No 

Settings 

Ability to reject data owners 

Yes/No/Partial 


Specification of individual cost functions 

Specification 


Specification of the federation revenue function 

Specification 

General 

Settings 

Incomplete self information 

Yes/No 

Mode of system evolution 

Static/Dynamic 

Belief updates 

No/Heuristic/Bayesian 


IC 

N o/N 1C/DIC/BIC/PB IC/Approx. 


IR 

Yes/No/ Approx. 

Objectives 

BB 

WBB/SBB/Approx. 


Social Optimization 

AE/PM/SSM/Approx. 


Fairness 

No/Specification/Approx. 


Collusion Resistancy 

Yes/No/Partial 


Table 2: Checklist for Specifications of FML Incentive Mechanisms 


Be noted here that we calculate and compare the expectations 
under A(© 1 ), because E A ( 0 i)U> 2 *(0) is a better approxima¬ 
tion to the realized social metric w 2 *(0) than E A ( 0 2 )it> 2 *( 0 ). 

Furthermore, According to the previous theorem, 

E A{§1) w(y 12 **(9),9) 

<E a(0 i ) w{y 1 **(0), 9) = E A(0 i ^*(9). (16) 

Therefore, E A ( 0 i)W 2 *( 6 ') < E A ( 0 i)tn 1 *( 6 '). Q.E.D. □ 

According to this theorem, the FML coordinator should 
only ask participants to report their types when he is uncertain 
about these types. If he knows the exact types of data owners 
and model users. For example, for the supply side problem, 
the set of all possible types is © 2 = {(< 7 , 7 ), q <G Q, 7 € T}. 
If individual cost type can be estimated accurately as 7 £ T, 
the type set becomes © 1 = {(< 7 , 7 ), q £ Q} C © 2 . Accord¬ 
ing to Theorem 2, the optimal value of the social metric can 
be improved in this case. 

Theorem 3 (More constraints, worse social optimization). 
The more constraint conditions (such as IC, IR, BB), the 
smaller is E A ( 0 )Ui*( 0 ). 

Proof. Let PV be the set of all available social choice func¬ 
tions y(-) : © 1 —>• Y without any constraint condition. 
< rf 1 = {Cl ,..., C^} and ‘if 2 = {C 2 ,..., Cf 2 } are two sets 
of constraint conditions so that 'if 1 C if 2 . 

Let be the set of social choice functions that satisfy 
constraint condition C\, l = 1,2, s = 1 ,ki and PV l de¬ 
notes the set of social choice functions that satisfy all con¬ 
straint conditions in if 1 . Therefore, '¥ l = d^L-, iff Since 
{ Wg,s = 1,..., ki} C {&. ' 2 , s = 1,..., k?}, we also have 
W 2 C '-¥ x and Eq.12. According to Theorem 1, Q.E.D. □ 


There is no free lunch when the LML coordinator chooses 
more objectives from Table 2. LML coordinators need to de¬ 
cide how many constraint conditions to include. 

Theorem 4 (Less incomplete information, better social 
optimization). The more certain is the FML coordinator 
about parameters included in social metrics, the larger is 
E A (@)W*{0). 

Proof. Suppose other than 9, the social metric also depends 
on another set of parameters ip. The LML coordinator is un¬ 
certain about the true value of ip, but instead, he has a belief 
A(ip) on ip so that E A ^[r/>] = ip. In this case, the opti¬ 
mization problem in Eq. 14 becomes 

max {E a( ^)E A ( 0 )Mi/(0), 9, ip)}}. (17) 

Let A 1 (ip) and A 2 (ip) belongs to the same family of prob¬ 
ability distributions, except that the latter has a bigger vari¬ 
ance, therefore, 

ll E Ai(* E A(©)Ny W, 9, ip)} - E A(&) [w(y{9),9, r/>))]|| 

< ll E A 2 (l /,)E A(0) [w(y(9), 9, ip)} -E A(0) [w(y(0),e l ,'i/?)]|| 

(18) 

Denote the solution of Eq.17 by y l *(-) and y 2 *(■), and the 
solution of Eq. 14 by y*f). From Eq. 18, we have 

||y 1 *(-)-y*(-)ll<lly 2 *(-)-y*(-)ll,and (19) 

lEA^wiy 1 *^),^^) - E A(@) w(y*(9),9,ip)\ 

< \E A{&) w(y 2 *[9),9,ip)-E A{&) w(y*(9),9,ip)\. (20) 
Since 

E A{&) w l *(9) =E A {&)w(y u (9),9,ip) 

< E A(&)w(y*(9),9, ip), fori = 1, 2. (21) 

Combining Eq. 20 and Eq. 21, Q.E.D. □ 



As an application of this theorem, when we maximize the 
federation profit on the supply side. If the FML coordinator 
has a more accurate estimation on the form of Bf), he should 
utilize it, thereby increasing the maximum federation profit. 

7 Conclusions and Future Work 

In this paper, we presented a precise definition of the prob¬ 
lem of FML incentive mechanism design and set up a basic 
research framework for it. We related general concepts in 
the field of mechanism design with meaningful parameters in 
the FML environment. We defined assumptions, constraint 
conditions, and the objective functions for this specific prob¬ 
lem. We brought up three reasonable assumptions to divide 
the problem into two subproblems with significantly reduced 
complexity, one on the demand side and the other on the sup¬ 
ply side. 

We described the categorizing criteria for FML incen¬ 
tive mechanisms, proposed a checklist to be added to every 
FML incentive mechanism, and proved four benchmark the¬ 
orems to aid FML practitioners to choose among FML incen¬ 
tive mechanisms without having to understand their internal 
workings. 

In subsequent researches, we plan to work out a series of 
concrete examples of FML incentive mechanisms designed 
under this framework, as well as provide an engineering 
framework to conveniently implement these mechanisms in 
practice. We will also dig deeper into some important details 
such as reducing information asymmetry and incomplete in¬ 
formation, measures of data quality and model quality, and 
intelligent recommendations on optimal strategies for data 
owners and model users. 
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